Abstract

We study the classification of cellular-automaton update rules into 
Wolfram's four classes. We start with the notion of the input entropy of a 
spatiotemporal block in the evolution of a cellular automaton, and build 
on it by introducing two novel entropy measures, one that is also based on 
inputs to the cells, the other based on state transitions by the cells. Our 
two new entropies are both targeted at the classification of update rules by 
parallel machines, being therefore mindful of the necessary communica- 
tions requirements; we call them cell-centric input entropy and cell-centric 
transition entropy to reflect this fact. We report on extensive computa- 
tional experiments on both one- and two-dimensional cellular automata. 
These experiments allow us to conclude that the two new entropies pos- 
sess strong discriminatory capabilities, therefore providing valuable aid in 
the classification process. 

Keywords: Classification of cellular automata. Input entropy. Parallel 
simulation of cellular automata. 

1 Introduction 

Cellular automata are discrete-time dynamical systems comprising finite-state 
units, called cells, whose states evolve in time as a result of the interactions with 
other cells. Since their introduction nearly five decades ago by von Neumann pP , 
cellular automata have acquired an ever more prominent status as a modeling
tool in several research areas (cf., e.g., |3 |3] and the references therein), and 
have even come to be regarded by some as a central abstraction in the modeling 
of nature's fundamental processes 0]. 

For $S = \{0, \ldots, s-1\}$ the set of possible states, and for $\delta > 0$ an integer, a
cellular automaton with $n$ cells evolves from time $t$ to time $t+1$ by synchronously
updating all $n$ states by the application of a deterministic mapping $F_f$ from $S^n$
to $S^n$. This mapping $F_f$ is global in nature and depends on the local update
rule $f$, which dictates how each individual state is to be updated given the cell's
state at time $t$ as well as the states of those cells that lie within a neighborhood
of size $\delta$. The update rule $f$ is then a mapping from $S^{1+\delta}$ to $S$.

Normally a cell's neighborhood in a cellular automaton is determined by an 
underlying multidimensional lattice according to several possible criteria. For 
example, a cell's neighbors relative to a certain dimension of the lattice may 
be taken to be those cells that are $r > 0$ edges away along that dimension but
no edges away along any other dimension, r being usually referred to as the 
radius of the update rule in that dimension. For unit radii in all dimensions, 
this characterizes what is known as the von Neumann neighborhood, but in 
this paper we employ the same denomination also for greater radii. Another 
example neighborhood comes from letting two cells be neighbors of each other 
whenever one can be reached from the other by treading no more edges along 
a certain dimension than the update rule's radius along that dimension. For 
unit radii this is the Moore neighborhood, but once again we generalize and in 
this paper employ the same denomination under greater radii as well. When n 
is finite, it is customary to regard the lattice as having cylindrical boundaries, 
that is, as allowing every cell to have exactly two nearest neighbors along each 
dimension. 
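
As a minimal sketch of the synchronous global update under cylindrical boundaries, the one-dimensional case can be written as below. The function name `step`, the dictionary representation of the update rule, and the choice of elementary rule 90 as an example are assumptions made for illustration only; they are not part of the original text.

```python
from itertools import product

def step(config, rule, r):
    """Apply one synchronous update to a 1D configuration with wraparound.

    `rule` maps each (1 + 2r)-tuple of states to the cell's next state.
    """
    n = len(config)
    return [
        rule[tuple(config[(c + k) % n] for k in range(-r, r + 1))]
        for c in range(n)
    ]

# Illustrative rule (an assumption): s = 2, r = 1, next state is the XOR of
# the two neighbors, which is elementary rule 90.
rule90 = {inp: inp[0] ^ inp[2] for inp in product((0, 1), repeat=3)}

x0 = [0, 0, 0, 1, 0, 0, 0]
x1 = step(x0, rule90, r=1)
```

Because every cell reads only the previous configuration, the list comprehension realizes the global mapping in one pass without any cell seeing a partially updated neighbor.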

Finite cellular automata, those for which $n$ is finite, are necessarily such
that $F_f$ eventually leads to a fixed point, or a limit cycle, of configurations in
$S^n$, that is, either $x$ such that $F_f(x) = x$ or $x_0, \ldots, x_{p-1}$, with $p > 0$, such
that $x_0 = F_f(x_{p-1})$, $x_1 = F_f(x_0)$, and so on 0. The case of infinite cellular
automata, on the other hand, is far more complicated and intriguing, since
now $n$ is formally infinite and no periodicity is guaranteed to emerge from the
successive application of $F_f$.

Both in the finite and in the infinite cases, cellular automata have along 
the years been the subject of theoretical and experimental analyses. For a 
summary of key results, the reader is referred, for example, to [HI El and to their 
many references. One of the most appealing topics of investigation has been the 
classification of the update rule $f$, and consequently of the cellular automata
based on it, regarding its "complexity." 

Interest in this question received its initial impetus from the study by Wol- 
fram of infinite one-dimensional cellular automata [5], which resulted in the 
empirical finding that, nearly regardless of initial states, $f$ consistently falls
within one of four possible qualitative categories: (i) evolution leads to a homo- 
geneous configuration, i.e., a configuration in which all cells have the same state; 
(ii) evolution leads to an inhomogeneous fixed point or to a limit cycle; (iii)
evolution leads to a chaotic succession of configurations; or (iv) evolution leads to
complex localized spatiotemporal structures that are "sometimes long-lived." 
Although initially conceived for the one-dimensional case, there is in principle 
no reason why such a qualitative classification should not also be applicable to 
higher-dimensional cases. In fact, similar studies for the two-dimensional case 
have appeared as early as in [5]. 

Naturally, class-(iv) update rules are intuitively associated with the
realization of "complex" computations by the cellular automata that are built on
them, that is, precisely those computations that underlie so much of the inter- 
est in cellular automata as modeling tools. Not surprisingly, then, considerable 
effort has been channeled into finding approaches to automatically categorize 
update rules into the classes (i)-(iv). Formally, all such efforts hover around 
the so-called limit set of an update rule $f$ in the infinite case, which is the
set of configurations that result from all possible initial configurations after the 
passage of arbitrarily long time. As it turns out, every nontrivial property of a 
limit set (i.e., a property that holds for at least one cellular automaton and does 
not hold for at least one other) can be proven undecidable through a reduction 
from the problem of whether a limit set is a singleton ^Uj , itself known to be 
undecidable [TT] . 

As a consequence of this inherent undecidability, every effective strategy 
for categorizing update rules must necessarily be of a heuristic nature or else 
eventually boil down to a heuristic if it is to have any practical use. Our interest 
in this paper is the study of heuristics that can be coupled with the parallel 
simulation of cellular automata in order to analyze the spatiotemporal patterns 
that emerge, aiming at categorizing the underlying update rule within Wolfram's 
four classes. Efficiency in the form of minimal communications needs is then an 
essential requirement, leading to what we term cell-centric heuristics, that is, 
heuristics that depend as minimally as possible on the exchange of information 
among processors during the simulation of a cellular automaton. 

We start in Section 2 with a review of some of the prominent heuristics that
have been proposed for automatically classifying update rules, and proceed in
Section 3 to a discussion of the so-called input-entropy measures. Our
cell-centric heuristics are presented in Section 4, with results from computational
experiments on one- and two-dimensional cellular automata given in Section 5.
Further considerations on the computational results appear in Section 6, and
concluding remarks come in Section 7.

2 Background 

In broad terms, we distinguish two essential kernel classes of strategies for the 
categorization of update rules. The first class comprises those techniques that 
aim at extracting the update rules' computational capabilities by solely consid- 
ering the update rule itself, not simulations of cellular automata for examination 
of the resulting spatiotemporal patterns. Approaches of this type have
concentrated on one-dimensional cellular automata, so a cell's neighborhood size is in
fact $\delta = 2r$.

The pioneering step within this class of approaches was taken by Langton
[T^ , who proposed to classify an update rule $f$ into the four Wolfram classes by
examining a single parameter, denoted by $\lambda_f$ and given by

$$\lambda_f = \frac{s^{1+2r} - q}{s^{1+2r}}. \qquad (1)$$

In (1), $q$ is the number of distinct $(1+2r)$-tuples on which $f$ outputs $\sigma$, where
$\sigma \in S$ is any of the so-called quiescent states, that is, $\sigma$ is such that
$f(\sigma, \ldots, \sigma) = \sigma$.
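
The parameter can be computed directly from a rule table. The sketch below assumes the usual form of Langton's parameter, $\lambda_f = (s^{1+2r} - q)/s^{1+2r}$, with $q$ the number of inputs mapped to the chosen quiescent state; the function name, the dictionary encoding of $f$, and the use of elementary rule 110 as an example are illustrative assumptions.

```python
from itertools import product

def langton_lambda(rule, s, r, quiescent=0):
    """Fraction of (1 + 2r)-tuples that `rule` does NOT map to `quiescent`."""
    total = s ** (1 + 2 * r)
    q = sum(
        1
        for inp in product(range(s), repeat=1 + 2 * r)
        if rule[inp] == quiescent
    )
    return (total - q) / total

# Illustrative rule (an assumption): elementary rule 110, s = 2, r = 1.
# State 0 is quiescent for it, since f(0, 0, 0) = 0.
rule110 = {
    inp: (110 >> (4 * inp[0] + 2 * inp[1] + inp[2])) & 1
    for inp in product((0, 1), repeat=3)
}
lam = langton_lambda(rule110, s=2, r=1, quiescent=0)
```

Note that the value depends on which quiescent state is chosen, which is one of the difficulties with the parameter mentioned in the text.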

The initial report on the use of the $\lambda_f$ parameter indicated that it behaves
as an order parameter with respect to which a phase transition occurs: on
generating update rules $f$ with increasing $\lambda_f$ one first encounters class-(i) behavior,
then class (ii), then class (iv) around $\lambda_f = 0.5$, then finally class-(iii) behavior.
This seemed to suggest that complexity was to be found at the region in the $\lambda_f$
space that became known as the "edge of chaos." But, in addition to the obvious
difficulties regarding the existence and choice of a quiescent state, criticism
regarding the existence and nature of the purported phase transition soon came
from several sources. In particular, it now appears that
update rules belonging to several classes, not just class (iv), are to be found
near $\lambda_f = 0.5$.

Two other interesting approaches have been introduced that are also of the 
same nature in that they also dispense with the need for computer simulations 
of cellular automata. The two approaches share the goal of investigating how 
the information contained in an update rule $f$ affects the overall behavior of
cellular automata built on $f$. One of the approaches is topological in nature,
that is, it seeks to analyze a cellular automaton's global behavior by identifying
finite-size spatial patterns and characterizing their appearance and evolution.
By contrast, the other one |Ej is algorithmic and aims at characterizing
update rule $f$ from the perspective of Kolmogorov complexity [18], that is, the
perspective of the shortest possible description of $f$. Both approaches relate
clearly to the Wolfram classification, while at the same time shedding new light 
on it, each from its particular perspective. The latter approach, in addition, 
may also hold a key to some of the incongruities that are inherent to Langton's 
parameter $\lambda_f$. It is worth remarking, however, that because each of the two
approaches induces its own class system, neither one is found to relate clearly 
to class (iv). The reader is referred to the original sources for details. 

A wholly distinct class of strategies to categorize cellular-automaton update 
rules concentrates on the examination of space-time patterns as they appear dur- 
ing the evolution of cellular automata from as representative a sample of initial 
configurations as possible. Now, of course, the fact that the cellular automata 
under examination are formally infinite has to be reckoned with; we will come to 
this later, and will for now ignore any difficulties that such infinities may cause 
in practice. We do mention, however, that some successful approaches are built 
from the start on the assumption that $n$ is finite. One example is the
"computational mechanics" exemplified in which for one-dimensional cellular
automata draws heavily on finite-state machines [20] derived from patterns in
the cellular automaton's evolution to characterize the fundamental spatiotem- 
poral features that are inherent to each update rule. 

The approaches in this class that are central to our interest are those that rely 
on some form of entropy measure as the basis of the categorization effort. The 
initial approach along these lines appeared in the same paper that introduced the 
four-class Wolfram classification. Essentially, what it does is to consider the
probability distribution of space-time blocks as they occur during the evolution
of cellular automata for a fixed update rule $f$ and then use this distribution to
define the desired entropies. 

For a more precise characterization, let $d > 0$ be the number of dimensions
of the cellular automata in question, and let $X_1, \ldots, X_d$ denote numbers of
contiguous cells along each dimension. For $T > 0$ a number of successive time
steps during an evolution of the cellular automaton that employs update rule
$f$, we need the probability, given $f$, that an $X_1 \times \cdots \times X_d \times T$ block of states
appears somewhere in the spatiotemporal trace of the cellular automaton's
evolution. Clearly, the number of possible blocks is $s^{X_1 \cdots X_d T}$; denote by $p_i$ the
probability of the $i$th block, $1 \le i \le s^{X_1 \cdots X_d T}$.

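Two basic entropies can now be defined. These are the set entropy

$$E_f(X_1, \ldots, X_d, T) = \frac{1}{T} \log_s \sum_{i=1}^{s^{X_1 \cdots X_d T}} \theta(p_i), \qquad (2)$$

where $\theta(p) = 1$ for $p > 0$ and $\theta(0) = 0$, and the measure entropy

$$E_f^\mu(X_1, \ldots, X_d, T) = -\frac{1}{T} \sum_{i=1}^{s^{X_1 \cdots X_d T}} p_i \log_s p_i. \qquad (3)$$

From them, we obtain the limiting quantities

$$h_f = \lim_{\substack{X_1, \ldots, X_d, T \to \infty \\ T/X_1, \ldots, T/X_d \to \infty}} E_f(X_1, \ldots, X_d, T) \qquad (4)$$

and

$$h_f^\mu = \lim_{\substack{X_1, \ldots, X_d, T \to \infty \\ T/X_1, \ldots, T/X_d \to \infty}} E_f^\mu(X_1, \ldots, X_d, T), \qquad (5)$$

respectively.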

The quantity in (4) gives the asymptotic rate at which the diversity of
spatiotemporal patterns increases with time, and the one in (5) represents the
average amount of "new information" that each new configuration of the
cellular automaton contributes as time elapses. As it turns out, these quantities (or
variations thereof obtained by taking the limit exclusively as $X_1, \ldots, X_d \to \infty$
while $T$ is kept constant, or as $T \to \infty$ while $X_1, \ldots, X_d$ are kept constant) yield
insight into how to categorize $f$. Coarsely, all indicators vanish for class-(i)
cellular automata and are nonzero for class-(iii) cellular automata. Discriminating
class (ii), in turn, requires decoupling space and time: indicators resulting from
letting $T \to \infty$ alone are zero, while those related to letting $X_1, \ldots, X_d \to \infty$ for
fixed $T$ are nonzero. As for class (iv), it once again eludes identification.

3 Input entropy 

The concept of input entropy is due to Wuensche [21] and constitutes an
attempt to merge together some of the key features of the two classes of strategies
discussed in Section 2: those that seek to base update-rule classification on
examining the update rule solely and those that rely on space-time signatures of
evolving cellular automata. Given one of the $X_1 \times \cdots \times X_d \times T$ state blocks of
that section, we start by considering the probability that, inside the block, each
of the possible $s^{1+\delta}$ inputs to a cell occurs. Denoting the probability of the $i$th
input by $Q_i$, $1 \le i \le s^{1+\delta}$, the input entropy is defined by

$$I_f(X_1, \ldots, X_d, T) = -\sum_{i=1}^{s^{1+\delta}} Q_i \log_s Q_i. \qquad (6)$$

The use of the input entropy in practice starts by fixing the values of
$X_1, \ldots, X_d$ and of $T$ and then choosing the $X_1 \cdots X_d$ cells to be observed during
simulations of the cellular automaton built on $f$. Each simulation is conducted
from a randomly selected initial configuration and runs for $t^+$ time steps, for
some $t^+ > T - 1$, generating a new configuration at each time step. For each
time $t$ in the interval $[t_0, t^+]$ with $T - 1 \le t_0 \le t^+$, the probability $Q_i$ that
the $i$th input occurs within an $X_1 \times \cdots \times X_d \times T$ block can be approximated by
$q_i^t / X_1 \cdots X_d T$, where $q_i^t$ is the number of occurrences of the $i$th input within
the block that ends at time $t$. The practical entropy figure that stems from (6)
is then, for the block that ends at time $t$,

$$I_f^t(X_1, \ldots, X_d, T) = -\sum_{i=1}^{s^{1+\delta}} \left( \frac{q_i^t}{X_1 \cdots X_d T} \right) \log_s \left( \frac{q_i^t}{X_1 \cdots X_d T} \right). \qquad (7)$$

The mean and variance of the quantity in (7) for $t = t_0, \ldots, t^+$, that is,

$$\bar{I}_f(X_1, \ldots, X_d, T) = \frac{1}{t^+ - t_0 + 1} \sum_{t=t_0}^{t^+} I_f^t(X_1, \ldots, X_d, T) \qquad (8)$$

and

$$\sigma^2(I_f(X_1, \ldots, X_d, T)) = \frac{1}{t^+ - t_0 + 1} \sum_{t=t_0}^{t^+} \left[ I_f^t(X_1, \ldots, X_d, T) - \bar{I}_f(X_1, \ldots, X_d, T) \right]^2, \qquad (9)$$

respectively, can be used to reveal the Wolfram class to which update rule $f$
belongs, after having themselves been averaged over some number of simulations
for randomly chosen initial configurations. Roughly, for $d = 1$ it has been found
that low means and variances bespeak class-(i) or (ii) behavior, while high means
and low variances indicate a class-(iii) update rule in action. Class-(iv) behavior
is characterized by medium-valued means and high variances [21].
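
The block-level input entropy and its mean and variance over the sliding window can be sketched as follows for $d = 1$. This is only an illustration: the function names, the representation of the space-time trace as a list of configurations, and the use of wraparound indexing at the edges of the stored configurations are assumptions, not the procedure used in the experiments.

```python
from collections import Counter
from math import log
from statistics import fmean, pvariance

def input_entropy(trace, r, s, T, t, cells):
    """Input entropy of the block of `cells` over times t-T+1..t (1D).

    trace[u][c] is the state of cell c at time u; neighborhood indices wrap
    around the stored configuration (an assumption of this sketch).
    """
    n = len(trace[0])
    counts = Counter(
        tuple(trace[u][(c + k) % n] for k in range(-r, r + 1))
        for u in range(t - T + 1, t + 1)
        for c in cells
    )
    total = len(cells) * T
    return -sum((q / total) * log(q / total, s) for q in counts.values())

def mean_and_variance(trace, r, s, T, t0, t_plus, cells):
    """Mean and (population) variance of the block entropies for t0..t_plus."""
    values = [input_entropy(trace, r, s, T, t, cells)
              for t in range(t0, t_plus + 1)]
    return fmean(values), pvariance(values)
```

Inputs that never occur in the block are simply absent from the counter, which matches the convention that a zero probability contributes nothing to the entropy.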

4 Cell-centric heuristics 

Computing the mean and variance of the input entropy as indicated respectively
in (8) and (9) requires simulating the cellular automaton that is based on $f$ for $t^+$
time steps and accumulating the quantity given in (7) while an $X_1 \times \cdots \times X_d \times T$
block "window" is slid from an initial position that makes the block end at time
$t_0$ through a final position at which the block ends at time $t^+$. When the
simulation is performed in parallel, the $X_1 \cdots X_d$ cells do not all reside at the
same processor, so computing the input entropy as given by (7) for all values
of $t$ requires a considerable amount of communication involving the processors
that lodge the cells.

Given that one processor, call it $P$, has been singled out for coalescing all the
information required for computing the mean and the variance, in essence the
number of integers that needs to be communicated to $P$ is $O(X t^+ \delta)$, where $X$ is
the number of cells allocated outside $P$. In the case of true massive parallelism
(one cell per processor), this becomes $O(X_1 \cdots X_d t^+ \delta)$. In general, for each cell
and each time $t$, the integers to be communicated are the $1 + \delta$ integers needed
for specifying the input to that cell at time $t$. If that input is the $i$th possible
input, then communicating the $1 + \delta$ integers contributes one unit to each of
$q_i^t, \ldots, q_i^{t+T-1}$; that is, it contributes to the calculation of the input entropy
of (7) for $T$ blocks (the one ending at time $t$ through the one ending at time
$t + T - 1$).

4.1 Cell-centric input entropy 

The crux of the $O(X t^+ \delta)$ communications requirement is that the logarithm
appearing in (7) can only be assessed after the contributions to $q_i^t$ have been taken
into account for all $X_1 \cdots X_d$ cells, in particular for all the $X$ cells lodged
outside processor $P$. The first step towards obtaining a cell-centric approximation
to (7), one that allows communications requirements to be reduced dramatically,
is to examine more closely the argument to that logarithm and to notice
that $q_i^t / X_1 \cdots X_d$ is the average number of occurrences of the $i$th input per cell
within the $X_1 \times \cdots \times X_d \times T$ block.

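Let $q_i^{c,t}$ denote the number of occurrences of the $i$th input for the $c$th cell
inside the block that ends at time $t$; clearly, $q_i^t = \sum_{c=1}^{X_1 \cdots X_d} q_i^{c,t}$ and (7) can be
rewritten as

$$\begin{aligned}
I_f^t(X_1, \ldots, X_d, T) &= -\sum_{i=1}^{s^{1+\delta}} \left( \frac{\sum_{c=1}^{X_1 \cdots X_d} q_i^{c,t}}{X_1 \cdots X_d T} \right) \log_s \left( \frac{\sum_{c=1}^{X_1 \cdots X_d} q_i^{c,t}}{X_1 \cdots X_d T} \right) \\
&= -\frac{1}{X_1 \cdots X_d} \sum_{c=1}^{X_1 \cdots X_d} \sum_{i=1}^{s^{1+\delta}} \left( \frac{q_i^{c,t}}{T} \right) \log_s \left( \frac{\sum_{c'=1}^{X_1 \cdots X_d} q_i^{c',t}}{X_1 \cdots X_d T} \right). \qquad (10)
\end{aligned}$$

Our cell-centric input entropy is defined by approximating $q_i^{c,t}$ by its average
over all cells in the block whenever convenient. That is, we use the approximation

$$\frac{\sum_{c'=1}^{X_1 \cdots X_d} q_i^{c',t}}{X_1 \cdots X_d} \approx q_i^{c,t} \qquad (11)$$

in the argument to the logarithm in (10) for all $c$ such that $1 \le c \le X_1 \cdots X_d$.
We then obtain

$$C_f^t(X_1, \ldots, X_d, T) = -\frac{1}{X_1 \cdots X_d} \sum_{c=1}^{X_1 \cdots X_d} \sum_{i=1}^{s^{1+\delta}} \left( \frac{q_i^{c,t}}{T} \right) \log_s \left( \frac{q_i^{c,t}}{T} \right), \qquad (12)$$

which is the cell-centric input entropy of the $X_1 \times \cdots \times X_d \times T$ block that ends
at time $t$.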
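
A minimal sketch of the cell-centric input entropy for $d = 1$ follows. Each cell keeps its own input histogram over the $T$ time steps of the block, and the per-cell entropies are simply averaged. The function name, the trace representation, and the wraparound indexing are assumptions made for illustration.

```python
from collections import Counter
from math import log

def cell_centric_input_entropy(trace, r, s, T, t, cells):
    """Average over `cells` of each cell's own input entropy in the block.

    trace[u][c] is the state of cell c at time u; the block covers times
    t-T+1..t, and neighborhood indices wrap around (an assumption).
    """
    n = len(trace[0])
    total = 0.0
    for c in cells:
        # Per-cell counts of each input over the T time steps of the block.
        counts = Counter(
            tuple(trace[u][(c + k) % n] for k in range(-r, r + 1))
            for u in range(t - T + 1, t + 1)
        )
        total -= sum((q / T) * log(q / T, s) for q in counts.values())
    return total / len(cells)
```

The inner sum depends only on data local to cell `c`, so in a parallel setting a processor could accumulate it for its own cells and forward a single partial sum per time step.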

The essential question, of course, is whether the cell-centric input entropy
defined in (12) still has discriminatory capabilities analogous to those of the
input entropy, given that the two are related by the approximation in (11).
The answer to this question is affirmative and is explored in Section 5 through
the mean $\bar{C}_f(X_1, \ldots, X_d, T)$ and the variance $\sigma^2(C_f(X_1, \ldots, X_d, T))$, defined
analogously to (8) and (9), respectively.

Let us then consider how much communication must be directed towards
the special processor $P$ during a parallel simulation of a cellular automaton.
Clearly, a processor $Q \ne P$ lodging $X_Q$ cells can calculate its portion of the
double summation in (12) for each $t$ (i.e., let $c$ range over its $X_Q$ cells) completely
locally. If $N$ denotes the number of processors, then $P$ has to receive $O(N t^+)$
floating-point numbers for the entire simulation. In the limit of true massive
parallelism, this becomes $O(X_1 \cdots X_d t^+)$, which relates to the communications
requirements of the original input entropy by a factor of $O(\delta)$ if we disregard
any differences between sending integers and sending floating-point numbers. So
using the cell-centric approximation to input entropy saves considerable amounts
of communication in the current technological reality of $N \ll X_1 \cdots X_d$ but
makes little sense in the limit of true massive parallelism.



4.2 Cell-centric transition entropy 

We perceive the functional form of (12) as being suggestive of a host of possible
different criteria that may be experimented with when attempting to classify
cellular-automaton update rules. One possibility that we have considered is the
following. For a fixed cell $c$ inside the $X_1 \times \cdots \times X_d \times T$ block that ends at time
$t$, let $r^{c,t}$ denote the number of state transitions within the block that cause
the state of the cell to change during the simulation. This quantity does not
depend explicitly on the inputs to the cell, so when deriving the corresponding
cell-centric entropy measure from (12), and considering that there are $T - 1$
state transitions per cell within the block, we obtain

$$T_f^t(X_1, \ldots, X_d, T) = -\frac{1}{X_1 \cdots X_d} \sum_{c=1}^{X_1 \cdots X_d} \left( \frac{r^{c,t}}{T - 1} \right) \log_s \left( \frac{r^{c,t}}{T - 1} \right). \qquad (13)$$

We call the quantity in (13) the cell-centric transition entropy relative to
the block that ends at time $t$. Naturally, it shares with the cell-centric input
entropy all the relevant characteristics that have to do with the parallel
simulation of cellular automata. In addition, as will become apparent in Section 5, it
also offers interesting glimpses into the categorization of the underlying update
rule when analyzed from the perspective of its mean $\bar{T}_f(X_1, \ldots, X_d, T)$ and its
variance $\sigma^2(T_f(X_1, \ldots, X_d, T))$, defined in analogy to (8) and (9), respectively.
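
The cell-centric transition entropy can be sketched in the same style. Here the per-cell quantity is the fraction of the $T - 1$ transitions in the block at which the cell's state actually changes; the function name and the trace representation are illustrative assumptions.

```python
from math import log

def cell_centric_transition_entropy(trace, s, T, t, cells):
    """Average over `cells` of -x log_s x, x = fraction of state changes.

    trace[u][c] is the state of cell c at time u; the block covers times
    t-T+1..t and therefore contains T-1 transitions per cell.
    """
    total = 0.0
    for c in cells:
        changes = sum(
            trace[u][c] != trace[u + 1][c] for u in range(t - T + 1, t)
        )
        x = changes / (T - 1)
        if x > 0:  # a cell that never changes contributes nothing
            total -= x * log(x, s)
    return total / len(cells)
```

Under this form, a cell that never changes and a cell that changes at every step both contribute zero, so the measure peaks for cells with an intermediate rate of change.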

4.3 Upper bounds 

Upper bounds on the cell-centric input entropy of (12) and the cell-centric
transition entropy of (13) can be established easily if we recall that entropies are
maximized when all the mutually exclusive events at hand are ascribed the same
probability. In the case of (12) this amounts to setting $q_i^{c,t}/T$ to $s^{-(1+\delta)}$ for all
appropriate values of $i$ and $c$, which yields

$$C_f^t(X_1, \ldots, X_d, T) \le 1 + \delta. \qquad (14)$$

The case of (13) is even simpler, as it suffices to recognize that $-x \log_s x$ is
maximized for $x = e^{-1}$ and to set $r^{c,t}/(T - 1)$ to this value for all appropriate values
of $c$, thus yielding

$$T_f^t(X_1, \ldots, X_d, T) \le \frac{1}{e \ln s}. \qquad (15)$$

But some of the results described in Section 5 are based on the so-called
outer-totalistic update rules 0, that is, update rules whose outcomes depend
not on the cell's individual state and those of its neighbors, but rather on the
cell's state and the sum of its neighbors' states. For such update rules, and
considering the $s = 2$ case only, while the bound given by (15) is still correct and
gives a value slightly above 0.53, in the case of the cell-centric input entropy it
no longer makes sense to assume a uniform probability distribution on all inputs
for entropy maximization, and consequently (14) has to be revised. The correct
level at which to assume the uniform distribution for entropy to be maximized
is now the level at which inputs are grouped with one another according to the
sum of the neighbor states that they comprise.

For $s = 2$ the number of distinct such groups is $1 + \delta$, each corresponding
to one of the possible sum values, from 0 to $\delta$. The probability associated with
each group is then $(1 + \delta)^{-1}$, so each individual input is assumed to occur with
probability $[(1 + \delta) n(\sigma)]^{-1}$, where $n(\sigma)$ is the number of inputs whose states
sum up to $\sigma$ with $0 \le \sigma \le \delta$. But $n(\sigma) = \binom{\delta}{\sigma}$, so from (12) we obtain

$$C_f^t(X_1, \ldots, X_d, T) \le \log_2(1 + \delta) + \frac{1}{1 + \delta} \sum_{\sigma=0}^{\delta} \log_2 \binom{\delta}{\sigma}. \qquad (16)$$
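
The bounds can be evaluated numerically as a sanity check. The sketch below assumes the forms $1/(e \ln s)$ for the transition-entropy bound and $\log_2(1+\delta) + \frac{1}{1+\delta} \sum_{\sigma=0}^{\delta} \log_2 \binom{\delta}{\sigma}$ for the outer-totalistic input-entropy bound with $s = 2$; the function names are hypothetical.

```python
from math import comb, e, log, log2

def transition_entropy_bound(s):
    """Maximum of -x log_s x over 0 < x <= 1, attained at x = 1/e."""
    return 1 / (e * log(s))

def outer_totalistic_input_bound(delta):
    """Bound for s = 2 when inputs are grouped by the sum of neighbor states."""
    groups = sum(log2(comb(delta, sigma)) for sigma in range(delta + 1))
    return log2(1 + delta) + groups / (1 + delta)

# Example: s = 2 and a Moore neighborhood with unit radii in two
# dimensions, for which delta = 3 * 3 - 1 = 8.
bound_T = transition_entropy_bound(2)
bound_C = outer_totalistic_input_bound(8)
```

For $s = 2$ the first value is the "slightly above 0.53" figure quoted in the text, and the second is necessarily smaller than the unrestricted bound $1 + \delta$.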

5 Computer experiments 

An infinite cellular automaton cannot be simulated in its entirety, nor can a 
portion of it be simulated for an indefinitely long number of time steps. One 
crucial first decision when planning such a simulation is which contiguous cells 
to observe along each of the d dimensions and also the number of time steps t+ 
during which to perform the simulation. Choosing a finite number of cells to ob- 
serve poses the question of how to handle the boundaries of the observed region, 
since those boundaries affect the simulation but cannot be extended indefinitely. 
Adopting artificial cylindrical boundaries or feeding randomly picked values to 
the boundary cells at each time step of the simulation will not in principle do, 
since this would have direct impact on the assumed infinite and deterministic 
nature of the cellular automaton. 

The solution, naturally, comes from first setting the number $t^+$ of
steps during which the cellular automaton is to be observed in the simulation, as
well as the number $\ell_k$ of contiguous cells to be observed along the $k$th dimension,
$1 \le k \le d$, and then working backwards from them. We start by assuming
a von Neumann neighborhood and then split a cell's neighborhood size $\delta$ into its
dimension-wise constituents; that is, we write $\delta = 2(r_1 + \cdots + r_d)$, where each
$r_k$ is the update rule's radius along the $k$th dimension. In order to output the
state at time $t^+$ of a cell that lies at a boundary along the $k$th dimension as if it
were indeed embedded in an infinite cellular automaton, the states of additional
$r_k$ off-boundary cells are needed along that dimension at time $t^+ - 1$. The
number of boundary cells along the $k$th dimension at time $t^+$ is

$$2 \prod_{\substack{l=1 \\ l \ne k}}^{d} \ell_l, \qquad (17)$$

so the total number of cells for which states are needed at time $t^+ - 1$ is

$$\ell_1 \cdots \ell_d + 2 \sum_{k=1}^{d} r_k \prod_{\substack{l=1 \\ l \ne k}}^{d} \ell_l \le \prod_{k=1}^{d} (\ell_k + 2 r_k). \qquad (18)$$

Note that, for $d = 1$, equality holds in (18). The upper bound is useful, though,
because it generalizes easily as we work backwards through time $t = 0$, revealing
that initial states are needed for cells that number no more than

$$\prod_{k=1}^{d} (\ell_k + 2 r_k t^+). \qquad (19)$$
The upper bound in (19) is clearly an exaggeration for $d > 1$ under a von
Neumann neighborhood, since it corresponds to an $(\ell_1 + 2 r_1 t^+) \times \cdots \times (\ell_d + 2 r_d t^+)$
patch of cells. Here we only point out that specifying the precise cells
whose initial states are needed is totally feasible, however cumbersome their
determination or the inner mechanics of a parallel simulation involving exactly
those cells and no others. In any event, we now have a set of cells that can be
simulated through time $t^+$ with the certainty that the observed behavior of the
core $\ell_1 \cdots \ell_d$ cells is fully compatible with the assumption of an infinite cellular
automaton and of the deterministic character of its update rule. Boundaries
still exist with respect to the extended set of cells, but the way they are handled
is now immaterial. Either cylindrical boundaries may be assumed or randomly
picked states may be used to fill up the inputs needed by the boundary cells.
The effects of either choice can only affect the states of the core cells after time
$t^+$.

When we assume a Moore neighborhood to start with, we write
$\delta = (1 + 2 r_1) \cdots (1 + 2 r_d) - 1$ instead, so that the number of cells for which states are
needed at time $t^+ - 1$ is exactly the upper bound appearing in (18). In this
case, clearly the number of cells given by (19) is no longer an exaggeration, but
expresses precisely what is needed.

In Figure 1 we provide an illustration of these issues in the two-dimensional
case when $\ell_1 = 3$ and $\ell_2 = 4$ with $r_1 = 2$ and $r_2 = 1$. What is shown is the set of
cells for which initial states are needed if the states of the shaded cells are to be
observed for $t^+ = 3$ further time steps as if those cells were part of an infinite,
deterministic cellular automaton. Cells enclosed within the thick solid contour
are those for which initial states are needed in the case of a von Neumann
neighborhood. Those enclosed with the thick dashed contour must have initial
states specified if a Moore neighborhood is used. Our practice henceforth is to
employ sets of cells of the latter type regardless of the neighborhood type in
use.
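
The size of the extended patch is easy to compute. The sketch below evaluates the rectangular patch $(\ell_1 + 2 r_1 t^+) \times \cdots \times (\ell_d + 2 r_d t^+)$ and, for comparison, the one-step von Neumann count $\ell_1 \cdots \ell_d + 2 \sum_k r_k \prod_{l \ne k} \ell_l$; the function names are hypothetical.

```python
from math import prod

def patch_shape(lengths, radii, t_plus):
    """Side lengths of the rectangular patch needing initial states."""
    return tuple(l + 2 * r * t_plus for l, r in zip(lengths, radii))

def patch_cells(lengths, radii, t_plus):
    """Number of cells in that rectangular patch."""
    return prod(patch_shape(lengths, radii, t_plus))

def one_step_cells_von_neumann(lengths, radii):
    """Cells needed one step back under a von Neumann neighborhood."""
    core = prod(lengths)
    extra = sum(
        2 * r * prod(l for j, l in enumerate(lengths) if j != k)
        for k, r in enumerate(radii)
    )
    return core + extra

# The two-dimensional example of the text: core of 3 x 4 cells,
# radii 2 and 1, observed for 3 time steps.
shape = patch_shape((3, 4), (2, 1), 3)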

5.1 The value of T 

When a cellular automaton is simulated with the goal of computing the input
entropy of Section 3 or one of the cell-centric quantities of Section 4 inside
an $X_1 \times \cdots \times X_d \times T$ state block, the core set of observed cells is such that
$\ell_1 = X_1, \ldots, \ell_d = X_d$. In this section, we discuss the choice of $T$ that maximizes
the discriminatory capabilities of our cell-centric heuristics in the context of the
Wolfram classes. We henceforth assume $S = \{0, 1\}$, i.e., $s = 2$.

Our approach has been to perform a set of initial experiments with $t^+ = 500$
on a single processor and to analyze their outcomes aiming at finding a
suitable $T$ value for use in the main experiments. We ran four sets of initial
experiments: one for $d = 1$ and $r_1 = 2$, one for $d = 1$ and $r_1 = 3$, one for
$d = 2$ under a von Neumann neighborhood with $r_1 = r_2 = 1$, and one last for
$d = 2$ under a Moore neighborhood with $r_1 = r_2 = 1$. For the one-dimensional
cases, each set comprised four runs for $X_1 = 150$ and four runs for $X_1 = 300$;
for the two-dimensional cases, we did four runs for each of $X_1 = X_2 = 15$ and
$X_1 = X_2 = 30$. Within each set, the first run corresponds to a known class-(i)
update rule, the second to a known class-(ii) update rule, and so on. The known
update rules we used are detailed in Table 1.

Figure 1: The $15 \times 10$ patch of cells for which initial states are needed in the
two-dimensional case of $\ell_1 = 3$, $\ell_2 = 4$, $r_1 = 2$, and $r_2 = 1$, so that the shaded
cells may be observed correctly for $t^+ = 3$. The thick-line enclosures refer to
the minimal sets of cells that are needed under a von Neumann (solid line) or a
Moore (dashed line) neighborhood.

In Table 1 the update rules are specified according to the following conventions.
For the one-dimensional experiments, each update rule is the hexadecimal
form of the binary number whose most significant bit is the update rule's output
to the input $11 \ldots 1$, read left to right, the next bit corresponds to $11 \ldots 1 - 1$,
and so on (cf., e.g., [22]). The two-dimensional von Neumann case is similar,
except that the most significant bit corresponds to $00 \ldots 0$, the next one
to $00 \ldots 0 + 1$, and so on, inputs being read in the self-north-east-south-west
order [SS]. The two-dimensional Moore case comprises outer-totalistic update
rules only, for which we adopt the usual notation style of Conway's Life |^ ESI: b$x$
indicates that the cell's state moves from 0 to 1 (it "is born") if the cell has $x$
neighbors in the 1 state; s$x$ means that the cell's state remains 1 (it "survives")
if the cell has $x$ neighbors in the 1 state; in all cases not listed explicitly, the
cell's state becomes 0.
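
The two notations can be decoded mechanically. The sketch below covers the one-dimensional hexadecimal convention (most significant bit is the output for input $11 \ldots 1$) and the birth/survival notation; the function names and the tuple-keyed lookup table are assumptions made for illustration, and elementary rule 110 (hexadecimal `6e`) is used only as a familiar example.

```python
import re

def decode_hex_rule(hex_rule, r):
    """Lookup table from binary (1 + 2r)-tuples to outputs (1D convention).

    The most significant bit of the rule number is the output for the
    all-ones input, the least significant bit for the all-zeros input.
    """
    width = 1 + 2 * r
    value = int(hex_rule, 16)
    table = {}
    for index in range(2 ** width):
        # The input tuple is the binary expansion of `index`, read left to right.
        inp = tuple((index >> (width - 1 - k)) & 1 for k in range(width))
        table[inp] = (value >> index) & 1
    return table

def parse_birth_survival(spec):
    """Parse a specification such as 'b3 s2s3' into (born, survive) sets."""
    tokens = re.findall(r"([bs])(\d)", spec)
    born = {int(x) for kind, x in tokens if kind == "b"}
    survive = {int(x) for kind, x in tokens if kind == "s"}
    return born, survive

def outer_totalistic_update(state, live_neighbors, born, survive):
    """Next state of a binary cell under a birth/survival rule."""
    if state == 0:
        return 1 if live_neighbors in born else 0
    return 1 if live_neighbors in survive else 0

rule110 = decode_hex_rule("6e", r=1)
life = parse_birth_survival("b3 s2s3")
```

A table produced by `decode_hex_rule` for radius 2 has 32 entries, which is why the corresponding rules are written with eight hexadecimal digits.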

Within each run, the cellular automaton is simulated for T = 5, 10, ..., 250
and for each simulation the mean and variance indicators of Section 0] are
computed. Simplifying the notation in the obvious way, these are $C_f$,
$\sigma^2(C_f)$, $T_f$, and $\sigma^2(T_f)$, f being the update rule under
consideration. All simulations sharing the same value of d, $X_1, \ldots, X_d$,
and $r_1, \ldots, r_d$ start at the same initial configuration, itself generated
at the beginning for that group of simulations

Table 1: Update rules used to generate the plots in Figures 2-5.

Experiment                   Class   Update rule
---------------------------------------------------------------------------
d = 1,                       (i)     1d000a20
$r_1 = 2$,                   (ii)    01dc3610
Figures 2(a-d) and 4(a-d)    (iii)   994a6a65
                             (iv)    6c1e53a8

d = 1,                       (i)     1df00000000f00000000000000000020
$r_1 = 3$,                   (ii)    7fdc3610fc48472c01dc361001dc3660
Figures 2(e-h) and 4(e-h)    (iii)   994f6a65994a6a65a94a6a65994a6a99
                             (iv)    3b469c0ee4f7fa96f93b4d32b09ed0e0

d = 2,                       (i)     00000601
$r_1 = r_2 = 1$,             (ii)    06900600
von Neumann,                 (iii)   69969669, Fredkin2
Figures 3(a-d) and 5(a-d)    (iv)    6db6fac8, Crystal2 EH!

d = 2,                       (i)     b3b6b7 s3s6s7s8
$r_1 = r_2 = 1$,             (ii)    b3 s2s5s6
Moore,                       (iii)   b1b3b5 s1s3s5
Figures 3(e-h) and 5(e-h)    (iv)    b3 s2s3


by randomly choosing initial states for the
$(X_1 + 2r_1t^+) \cdots (X_d + 2r_dt^+)$ cells involved.

The results of these initial experiments are given in Figures 2 through 5 and
also in Appendix A, where plots of spatiotemporal patterns are given for selected
runs. Figures 2 and 3 refer, respectively, to the behavior of the cell-centric
input entropy for the one- and two-dimensional cases as T varies. Figures 4 and
5, in turn, refer to the behavior of the cell-centric transition entropy for the
one- and two-dimensional cases, respectively, as T varies. Notice that, by virtue
of our experiments' setup, the plots in Figures 2 and 4 for which $r_1$ and $X_1$
have the same values correspond to the same initial configuration, and similarly
for Figures 3 and 5.

Even though this first set of experiments is deprived of statistical significance 
(based as it is on runs from a single initial configuration), it provides an initial 
indication of the discriminatory capabilities of our cell-centric heuristics. In fact, 
an examination of all mean- and variance-plot pairs in Figures 2-5 reveals that,
with a few exceptions, classes (i)-(iv) can in the worst case be discriminated 
within roughly one order of magnitude by either the mean or the variance of 
both the cell-centric input and transition entropies for most values of T. For 
example, comparing the plots in Figures E^a) and (b) indicates that the mean 
cell-centric input entropy provides good discrimination among the four classes, 
except between classes (iii) and (iv) , which nonetheless can be told apart easily 
by the variance of that entropy. The exceptions are the two-dimensional cases
with Moore neighborhoods, in which neither of the cell-centric heuristics seems
to be able to capture the distinction between classes (i) and (ii).

Figure 2: Mean ($C_f$) and variance ($\sigma^2(C_f)$) of the cell-centric input
entropy as a function of T under four different update rules, one from each of
classes (i) through (iv), for d = 1. Data are given for the 150-cell case with
$r_1 = 2$ (a and b), the 300-cell case with $r_1 = 2$ (c and d), the 150-cell
case with $r_1 = 3$ (e and f), and the 300-cell case with $r_1 = 3$ (g and h).

Figure 3: Mean ($C_f$) and variance ($\sigma^2(C_f)$) of the cell-centric input
entropy as a function of T under four different update rules, one from each of
classes (i) through (iv), for d = 2. Data are given for the (15 × 15)-cell von
Neumann case (a and b), the (30 × 30)-cell von Neumann case (c and d), the
(15 × 15)-cell Moore case (e and f), and the (30 × 30)-cell Moore case (g and
h). In all cases, $r_1 = r_2 = 1$.

Figure 4: Mean ($T_f$) and variance ($\sigma^2(T_f)$) of the cell-centric
transition entropy as a function of T under four different update rules, one from
each of classes (i) through (iv), for d = 1. Data are given for the 150-cell case
with $r_1 = 2$ (a and b), the 300-cell case with $r_1 = 2$ (c and d), the
150-cell case with $r_1 = 3$ (e and f), and the 300-cell case with $r_1 = 3$ (g
and h).

Figure 5: Mean ($T_f$) and variance ($\sigma^2(T_f)$) of the cell-centric
transition entropy as a function of T under four different update rules, one from
each of classes (i) through (iv), for d = 2. Data are given for the
(15 × 15)-cell von Neumann case (a and b), the (30 × 30)-cell von Neumann case
(c and d), the (15 × 15)-cell Moore case (e and f), and the (30 × 30)-cell Moore
case (g and h). In all cases, $r_1 = r_2 = 1$.

We return to this discussion of our heuristics' discriminatory capabilities
shortly, after we have provided more significant data. But returning to the
original goal of this initial set of experiments, we see from the plots in
Figures 2-5 that several possibilities exist for choosing a value for T. We note
that, naturally, choosing as small a value as possible has the advantage of
alleviating the processing demands for computing the entropy figures. With these
observations in mind, our choice hereafter is to use T = 25.

5.2 Experimental results 

Simulating a cellular automaton in parallel is essentially an exercise in
designing a simple synchronous distributed algorithm, in the sense described in
[26], employing for synchronization the technique of α-synchronization of [27].
Within this general framework, several proposals have been put forward (cf.,
e.g., |2HlEnillI|).

Our parallel simulator is no exception and has been designed and implemented
within this same framework for one- and two-dimensional cellular automata. Each
simulation is initiated by partitioning the automaton among the N available
processors. The hardware we have used in all the experiments described
henceforth has N = 8. All processors have the capability of communicating
directly with all others. For d = 1, the cells are partitioned equitably among
the processors in such a way that each processor receives a contiguous set of
cells to simulate; for d = 2, the automaton is subdivided into rectangles of
contiguous cells by slicing it equitably along the dimension that has the
fewest cells.

Notice that the neighborhood relation among cells as given by the lattice that 
underlies the cellular automaton automatically implies a neighborhood relation 
among processors, too. Specifically, two processors are neighbors whenever at 
least one cell that one of them lodges is a neighbor of a cell lodged by the other. 
Obviously, some of the cells that a processor lodges are distinguished in that 
their states are needed by the processor's neighbors; we refer to such cells as 
frontier cells. 
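For the one-dimensional case, the partitioning and the notion of frontier cells can be sketched as follows. This is only an illustration under stated assumptions: cells are indexed 0 to n - 1 with no wrap-around, the neighborhood has radius r, and the helper names `partition` and `frontier_cells` are hypothetical rather than part of our simulator.

```python
def partition(n_cells, n_procs):
    """Split cells 0..n_cells-1 into n_procs contiguous, near-equal ranges."""
    base, extra = divmod(n_cells, n_procs)
    ranges, start = [], 0
    for p in range(n_procs):
        size = base + (1 if p < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def frontier_cells(ranges, p, r):
    """Cells lodged by processor p whose states another processor needs."""
    mine = ranges[p]
    frontier = set()
    for q, other in enumerate(ranges):
        if q == p:
            continue
        for cell in other:
            # Every cell within distance r of `cell` is one of its inputs.
            for neighbor in range(cell - r, cell + r + 1):
                if neighbor in mine:
                    frontier.add(neighbor)
    return sorted(frontier)


ranges = partition(4000, 8)           # 8 processors, 500 cells each
inner = frontier_cells(ranges, 1, 2)  # r cells at each end of the block
```

With 4000 cells, 8 processors, and r = 2, an interior processor has four frontier cells (two at each end of its block), while the two processors at the ends of the lattice have only two.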

The simulation proper starts at each processor with the assignment of a
randomly chosen initial state to each of the cells it lodges and the sending of
the initial states of all frontier cells to the neighbor processors at which they
are needed. The processor then iterates as t is incremented from 0 through $t^+$:
for each t, new states are computed for all the cells that the processor lodges
and so are the portions of l|12|) and corresponding to those of its cells that 
are observed, provided $t \geq T - 1$; then the new states of the processor's
frontier cells are sent where they are needed. At the end, each processor that
lodges at least one observed cell forwards its $2(t^+ - T + 2)$ entropy results
to a previously designated processor for computation of the two means and
variances (viz. the means $C_f$ and $T_f$ and the variances $\sigma^2(C_f)$ and
$\sigma^2(T_f)$).

One-dimensional cellular automata

Our setup for the one-dimensional experiments is based on either $r_1 = 2$ or
$r_1 = 3$. For $r_1 = 2$, the setup has $X_1 = 2000$, $t^+ = 500$, and T = 25.
The overall number of cells to simulate is then $X_1 + 2r_1t^+ = 4000$, so the
number of observed cells constitutes half of the total. For $r_1 = 3$, our setup
has $X_1 = 2400$, $t^+ = 400$, and T = 25. In this case, the total number of
cells in the simulation is 4800, and once again the observed cells account for
half the total number of cells.
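The role of the extra $2r_1t^+$ cells can be seen in a small serial sketch. It is not our parallel simulator and computes no entropies: it assumes binary states, takes the rule as an integer lookup table whose bit v is the output for the input of value v (the one-dimensional convention of Table 1), and simply lets the array shrink by $r_1$ cells at each end per step so that no boundary condition is ever needed. The names `step` and `simulate` are hypothetical.

```python
import random


def step(cells, table, r):
    """One synchronous update; the result is 2r cells shorter than `cells`."""
    width = 2 * r + 1
    new_cells = []
    for i in range(len(cells) - 2 * r):
        v = 0
        for c in cells[i:i + width]:  # read the neighborhood left to right
            v = (v << 1) | c
        new_cells.append((table >> v) & 1)
    return new_cells


def simulate(table, r, x, t_plus, seed=0):
    """Return the initial configuration and the x observed cells at each t."""
    rng = random.Random(seed)
    cells = [rng.randint(0, 1) for _ in range(x + 2 * r * t_plus)]
    initial = list(cells)
    observed = []  # observed[t] holds the states of the observed cells at time t
    for t in range(t_plus + 1):
        margin = (len(cells) - x) // 2  # equals r * (t_plus - t)
        observed.append(cells[margin:margin + x])
        if t < t_plus:
            cells = step(cells, table, r)
    return initial, observed


# Elementary rule 204 (hexadecimal cc) copies the center cell, so the
# observed cells never change; this makes the bookkeeping easy to check.
initial, observed = simulate(0xCC, r=1, x=10, t_plus=5)
```

Starting from $X_1 + 2r_1t^+$ cells, exactly $X_1$ cells remain after $t^+$ steps, and these are the observed cells.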

In the one-dimensional case, the number of distinct update rules is given by
$2^{2^{1+2r_1}}$, that is, $2^{32}$ for $r_1 = 2$ and $2^{128}$ for $r_1 = 3$.
Our results are based on 50000 update rules randomly chosen out of those and are
shown in Figure 6 as plots of the variances $\sigma^2(C_f)$ and $\sigma^2(T_f)$
against the means $C_f$ and $T_f$, respectively. The points that correspond to
the one-dimensional update rules of Table 1 are not shown explicitly but are
singled out by indications, at their coordinates, of the classes to which the
update rules belong.

One crucial piece of information that has been left out of the plots in Figure 6
in order to avoid any further cluttering is the density of points at any
particular mean-variance region. We provide some of this information next. First
we choose, for each of the plots, a value for the mean entropy that separates the
update rules labeled (i) or (ii) from those labeled (iii) or (iv). In Figures
6(a) and (b), which refer to the cell-centric input entropy, this mean entropy
can be taken to be 1; in Figures 6(c) and (d), it can be taken to be 0.1.
Selecting this value partitions the plot into two regions and for each one we now
select an entropy variance that can be used to separate the update rules labeled
(i) and (ii) on the left, and another that can likewise be used for those labeled
(iii) and (iv). In parts (a) and (b) of the figure, our choices are 0.001 and
0.1, respectively on the left and right sides; in parts (c) and (d) the
corresponding values are 0.0001 and 0.001. At the end, in each plot we are left
with a partition into four regions, each containing exactly one of the update
rules labeled (i)-(iv).

We may then provide the missing information. In part (a), 2.30% of the 
update rules are inside the (i) region, 8.10% in the (ii) region, 86.96% in the
(iii) region, and 2.64% in the (iv) region. Part (b) contains no update rules 
inside the (i) region, 0.80% of the update rules in region (ii), 97.85% in (iii), and 
1.35% in (iv). In part (c) we have the figures 2.54%, 9.30%, 83.47%, and 4.69%. 
In part (d) we once again have no update rules inside the (i) region and the 
remaining figures are 0.80%, 97.59%, and 1.61%. The well-known preponderance 
of class-(iii) update rules, as well as the relative rarity of class-(iv) update rules, 
particularly as ri is increased from parts (a) and (c) to parts (b) and (d), are 
then confirmed. 
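The four-region partition just described amounts to three threshold tests. The sketch below is one possible reading of it, not the procedure's definitive form: it assumes that, on each side of the mean separator, the lower-variance region is the one labeled (i) on the left and (iii) on the right, which follows the low-variance tip and the past-the-bend behavior described in Section 6 but need not hold for every plot. The function names and the sample points are hypothetical.

```python
def region(mean, variance, mean_sep, var_sep_left, var_sep_right):
    """Label a (mean, variance) pair with one of the four regions."""
    if mean < mean_sep:
        # Left of the mean separator: classes (i) and (ii).
        return "(i)" if variance < var_sep_left else "(ii)"
    # Right of the mean separator: classes (iii) and (iv).
    return "(iii)" if variance < var_sep_right else "(iv)"


def region_shares(points, mean_sep, var_sep_left, var_sep_right):
    """Percentage of points falling in each region."""
    counts = {"(i)": 0, "(ii)": 0, "(iii)": 0, "(iv)": 0}
    for mean, variance in points:
        counts[region(mean, variance,
                      mean_sep, var_sep_left, var_sep_right)] += 1
    total = len(points)
    return {label: 100.0 * n / total for label, n in counts.items()}


# Separators quoted for Figure 6(a); the four points are made up.
points = [(0.0005, 0.00005), (0.5, 0.01), (4.5, 0.0002), (2.5, 0.4)]
shares = region_shares(points, mean_sep=1.0,
                       var_sep_left=0.001, var_sep_right=0.1)
```

Applied to the 50000 sampled mean-variance pairs of a plot, `region_shares` would return percentages of the kind quoted above.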

Two-dimensional cellular automata 

For the two-dimensional experiments we use $r_1 = r_2 = 1$ throughout. Regardless
of the neighborhood type (von Neumann or Moore), our experiments' setup has
$X_1 = X_2 = 100$, $t^+ = 50$, and T = 25. The total number of cells to be
simulated is therefore $(X_1 + 2r_1t^+)(X_2 + 2r_2t^+) = 40000$, so the number
of observed cells is one quarter of the total number.

Figure 6: Occurrence of mean-variance pairs within the general experimental
setup for one-dimensional cellular automata. Data are given for the cell-centric
input entropy with $r_1 = 2$ (a) and $r_1 = 3$ (b) as plots of $\sigma^2(C_f)$
against $C_f$, and also for the cell-centric transition entropy with $r_1 = 2$
(c) and $r_1 = 3$ (d) as plots of $\sigma^2(T_f)$ against $T_f$. Each plot
contains 50000 points, each point corresponding to a randomly chosen update rule
and to an average over 5 randomly chosen initial configurations. The
one-dimensional update rules of Table 1 are also shown within the same
experimental setup, but not as points: instead, they are singled out with an
indication at their coordinates of which of classes (i)-(iv) they belong to.

Similarly to the one-dimensional case, in the two-dimensional case with von
Neumann neighborhoods there are $2^{2^{1+2(r_1+r_2)}} = 2^{32}$ distinct update
rules. With Moore neighborhoods, and considering only outer-totalistic update
rules, the number of distinct update rules is
$2^{2(1+2r_1)(1+2r_2)} = 2^{18}$. Once again, in both cases our results are
based on 50000 update rules randomly chosen out of the corresponding sets.^ They
are shown in the plots of Figure 7 in the same style as Figure 6. As in the case
of that figure, the marginal indications (i)-(iv) give the coordinates at which
the points corresponding to the two-dimensional update rules of Table 1 would be
found, had they been plotted explicitly. Parts (a) and (c) of the figure refer
to a von Neumann neighborhood, parts (b) and (d) to a Moore neighborhood.
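The rule counts used in the experiments, together with the unrestricted Moore count mentioned in the accompanying footnote, can be checked with a few lines of arithmetic. The sketch assumes binary states and works with base-2 exponents so that the numbers stay small; the helper names are hypothetical.

```python
import math


def general_rule_exponent(n_inputs):
    """log2 of the number of binary update rules over n_inputs cells."""
    return 2 ** n_inputs


def outer_totalistic_exponent(r1, r2):
    """log2 of the number of binary outer-totalistic Moore update rules."""
    neighborhood = (1 + 2 * r1) * (1 + 2 * r2)  # the cell plus its neighbors
    # Two own states times `neighborhood` possible neighbor sums
    # (0 through neighborhood - 1).
    return 2 * neighborhood


exponents = {
    "1-D, r1 = 2": general_rule_exponent(1 + 2 * 2),
    "1-D, r1 = 3": general_rule_exponent(1 + 2 * 3),
    "2-D von Neumann": general_rule_exponent(1 + 2 * (1 + 1)),
    "2-D Moore, outer-totalistic": outer_totalistic_exponent(1, 1),
}

# Unrestricted Moore rules: one output bit per configuration of 9 cells.
unrestricted_moore = general_rule_exponent((1 + 2 * 1) * (1 + 2 * 1))

# Gap, in decimal orders of magnitude, to the largest set actually sampled.
gap = (unrestricted_moore - max(exponents.values())) * math.log10(2)

# Fraction of all outer-totalistic Moore rules covered by 50000 samples,
# if the samples were all distinct.
sampled_fraction = 50000 / 2 ** exponents["2-D Moore, outer-totalistic"]
```

The exponents come out as 32, 128, 32, and 18, against 512 for unrestricted Moore rules. Only in the outer-totalistic Moore case is a sample of 50000 a sizable fraction of the rule set.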

Once again additional information regarding the density of points in the four
plots must be given on the side. Following the same methodological steps as for
the one-dimensional cases, first we select a mean-entropy value for each plot to
separate the update rules labeled (i) or (ii) from those labeled (iii) or (iv),
and then we select an entropy-variance value to separate each pair of labeled
update rules. As the four plots in Figure 7 indicate, this may prove a harder
task than in the one-dimensional cases, since it is now common to find two or
more labels clustered together along one of the axes.

Let us begin with part (a), which refers to the cell-centric input entropy
under a von Neumann neighborhood. If we select 1.73 as the first separator, and
then select 0.0005 on the left and 0.1 on the right, then we are left with no
update rules lying within the class-(i) region, while 7.92% of the update rules
are in region (ii), 90.47% in region (iii), and 1.61% in region (iv). Moving to
part (b) we select the separators 1, 0.007, and 0.1, which yields the
percentages 4.42%, 12.08%, 77.19%, and 6.31%, respectively for regions (i)-(iv),
relative to the cell-centric input entropy under a Moore neighborhood. The
remaining two plots, in parts (c) and (d), are both relative to the cell-centric
transition entropy, respectively under a von Neumann and a Moore neighborhood.
Selecting the separators 0.4, 0.0002, and 0.0001 in the former case yields 6.67%,
16.07%, 73.27%, and 3.98%. In the latter case, we select 0.1, 0.0072, and 0.0001,
obtaining 19.02%, 1.82%, 70.74%, and 8.42%. As in the one-dimensional case,
indications are once again clear concerning the relative predominance and rarity
of classes (iii) and (iv), respectively.

^Note that restricting Moore-neighborhood update rules to lie within the set of
outer-totalistic update rules is a means to ensure that these 50000 samples have
some statistical representativeness. In the absence of this restriction, the
number of possible update rules becomes $2^{2^{(1+2r_1)(1+2r_2)}}$. This number,
with values for $r_1$ and $r_2$ as we have adopted, is $2^{512}$, which is larger
by more than a hundred orders of magnitude than the number of distinct update
rules in any of our other experiments.

Figure 7: Occurrence of mean-variance pairs within the general experimental
setup for two-dimensional cellular automata. Data are given for the cell-centric
input entropy with the von Neumann (a) and the Moore (b) neighborhoods as plots
of $\sigma^2(C_f)$ against $C_f$, and also for the cell-centric transition
entropy with the von Neumann (c) and the Moore (d) neighborhoods as plots of
$\sigma^2(T_f)$ against $T_f$. Each plot contains 50000 points, each point
corresponding to a randomly chosen update rule and to an average over 5 randomly
chosen initial configurations. The two-dimensional update rules of Table 1 are
also shown within the same experimental setup, but not as points: instead, they
are singled out with an indication at their coordinates of which of classes
(i)-(iv) they belong to.

6 Discussion



The data shown in Figures 6 and 7, respectively for one- and two-dimensional
cellular automata, all tend to exhibit the following behavior regarding classes
(i)-(iv). When plotted on doubly logarithmic scales, they appear clustered
roughly as a boomerang whose traversal from the low-mean, low-variance tip
leads us through class-(i) update rules, then class (ii), then class (iv) near
the middle bend, and finally class (iii) past the bend. Entropy means increase at
varying rates along the traversal, while variances increase at first but fall
again after the middle bend in the cluster's shape.

The one-dimensional cases, depicted in Figure 6, indicate unequivocally that
the eight one-dimensional update rules of Table 1 can be told apart by at least
one full order of magnitude of the mean entropy or the entropy variance, often
both, regardless of which cell-centric entropy is being used. While the same
holds unchanged for the case shown in Figure 7(a), which refers to the
cell-centric input entropy under a von Neumann neighborhood, the remaining three
parts, (b)-(d), must be examined in more detail. The case of part (b), in which
the input entropy is still the one in use but now under a Moore neighborhood,
allows proper separation between classes (iii) and (iv), but apparently leaves
classes (i) and (ii) mixed up together. The picture as we move to part (c),
which corresponds to the cell-centric transition entropy under a von Neumann
neighborhood, is once again subject to mix-ups, this time between classes (ii)
and (iv). The final case is that of part (d), corresponding to the transition
entropy and to a Moore neighborhood. In this case, as in the case of part (b),
classes (i) and (ii) are hard to tell apart.

We envisage two major trends underlying these class mixtures in the
mean-variance plots for the two-dimensional cases. The first one has to do with
part (c) of Figure 7, where a mix-up of classes (ii) and (iv) under a von Neumann
neighborhood turns up when the transition entropy is used. What may be
happening is that the relatively low value of 50 chosen for $t^+$ (for the
strictly practical reason of keeping our run times within reasonable bounds while
the plots of the figure were produced) is insufficient for the automaton to
settle into a more typical class-(ii) behavior (and hence mean and variance
values of the transition entropy that are commensurate with that class). This is
somewhat supported by an examination of Figure 5, but clearly calls for
additional investigation (more on this in Section 7).

The second trend concerns parts (b) and (d) of Figure 7 mainly, and thus has
to do with the mix-up between classes (i) and (ii) when a Moore neighborhood 
is in use, but may also be related to the case of part (c) we just discussed. 
Our expectation that the Wolfram classification carries on naturally to the two- 
dimensional case comes from Wolfram's own investigations on two-dimensional 
cellular automata 9 , but this has been challenged on the grounds that such a 
classification scheme fails to recognize the real sign of complex behavior in outer- 
totalistic, two-dimensional update rules, which is the presence of the so-called 
gliders, that is, the structures that are seen to "glide" across the
two-dimensional lattice as time elapses. If this is the case, then what is out of
place is not the mix-up of classes (i) and (ii), but rather the separation
between the two classes, since neither of the two exhibits gliders and they
should therefore be coalesced together into one single class.

But beyond these slight conflicts, and whichever of the competing trends may
win at the end, we perceive our experiments' outcomes as expressed in Figures 6
and 7 as laying out an overall methodology for the classification of
cellular-automaton update rules, one that in many senses confirms the initial
conclusions of [21]. First and foremost is an examination of the mean entropies
vis-a-vis the bounds made available in (14) and the equations that follow it. For
the one-dimensional cases, (14) predicts that no mean cell-centric input entropy
goes beyond $1 + 2r_1$, while predicting $1 + 2(r_1 + r_2)$ as the maximum for
the von Neumann two-dimensional case. Thus, the upper bound turns out to be 5 in
the case of part (a) of Figure 6, 7 for part (b), and 5 for Figure 7(a). The
tighter bound given earlier for outer-totalistic, Moore-neighborhood update rules
yields approximately 6.87 for Figure 7(b). While the four plots respect the
corresponding upper bounds (this may not be immediate from the figures, owing to
the logarithmic scale, but we know it from our files and refrain from presenting
further plots), none of the 50000 randomly chosen update rules comes very near
its bound. Perhaps this is due to the difficulty of sampling an update rule whose
mean input entropy comes sufficiently near the bound, but the fact remains that
comparing an update rule's mean input entropy to its known upper bound may be of
little help towards classifying the update rule.

The case of parts (c) and (d) of both Figures 6 and 7, being as they are
based on the cell-centric transition entropy, is different. In this case, meeting
the upper bound of approximately 0.53 given by (15) does not seem to depend
on serendipitously finding any particular update rule. In fact, update rules
whose mean transition entropy approaches the bound closely occur frequently, as
once again can be seen in the figures. When using the transition entropy, then,
a useful first step is to compare the update rule's mean entropy with this bound:
if close enough, almost surely the update rule is a class-(iii) one.

Beyond this initial test against known upper bounds, what remains of the
aforementioned overall methodology is essentially a cladistics-like^ buildup of
relationships among update rules given their cell-centric (input and/or
transition) entropy means and variances. The crux here is that classification is
the product of comparison, hence the fundamental importance of update rules such
as the ones in Table 1, for which we are capable of providing a desired
classification a priori so they can function as seeds in the larger
classification process.

^More generally, there have been arguments calling for classification schemes
that take the particular application area under study into consideration more
seriously (cf., e.g., [15]).

^Here we allude to the method known as cladistics for hypothesizing relationships
among (extant or extinct) organisms. Beyond its core assumptions, the method in
essence relies on examining several characters of the organisms and employing
them for grouping the organisms into the desired taxa [33].

7 Conclusions



We have in this paper addressed the automatic classification of the update rules
of cellular automata. Our point of departure has been the notion of input
entropy, on which we built by the introduction of two novel entropy measures,
both inspired by, and targeted at, the simulation of cellular automata by
message-passing parallel machines. Our two new measures are the cell-centric
input entropy and the cell-centric transition entropy. For both of them we
provided extensive experimental results on both one- and two-dimensional cellular
automata. Within our assumed classification context, that of Wolfram's four-class
scheme, these results demonstrated that the two new measures provide satisfactory
discriminatory capabilities in the one-dimensional case, while in the
two-dimensional case they are also good discriminators but in addition help
support other authors' suggestions that a better classification scheme may be
needed.

Our experimental results were the product of a parallel implementation of a
simulator coupled with a module for calculating the two cell-centric entropies.
We conclude by commenting on some performance-related aspects of this simulator.
The results presented in Section 5 were obtained on an eight-computer cluster,
each computer based on an Intel Pentium 4 processor running at 1.8 GHz and
having 1 gigabyte of memory. The eight computers are fully interconnected by
a gigabit-ethernet switch. On this cluster, each of the eight test suites of
Section 5.2, comprising 5 independent runs for each of 50000 update rules,
requires somewhere from three to six days to complete, depending on which of the
four update-rule categories (one-dimensional with two possible radii, von Neumann
two-dimensional, Moore two-dimensional) and which of the cell-centric measures
(input or transition entropy) are being used.

The fact that we are simulating infinite cellular automata, as explained right
at the beginning of Section 5, is naturally the source of considerable load
imbalance among the processors. We have paid no heed to this issue, but clearly
it has to be reckoned with by anyone undertaking the parallel simulation of
large-scale cellular automata if the effects of infinite boundaries are to be
taken into account. There are two kinds of load imbalance to be considered. First
is the fact that only those processors that lodge some of the observed cells
actually perform entropy-related calculations; among these, those that lodge more
of those cells are more loaded by that kind of computation. Secondly, cells that
are not observed but do nonetheless participate in the simulation for the sole
sake of providing the illusion of an infinite cellular automaton do not have to
be simulated for all the $t^+$ steps; instead, as time elapses progressively
fewer such cells need to be simulated. Once these two types of load imbalance are
taken into account, there are all sorts of policies that can be adopted to
re-balance the computational load among the processors. We dwell on the issue no
further in this paper, but it is clearly important and should be considered upon
embarking on a more performance-aware implementation.
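The second kind of imbalance can be quantified for the one-dimensional setups of Section 5.2 with a short calculation. The sketch below assumes that, for the $X_1$ observed cells to be correct up to time $t^+$, only the cells within distance $r_1(t^+ - t)$ of them need valid states at time t; this is our reading of the remark above rather than a measured property of the simulator, and the function names are hypothetical.

```python
def cells_needed(x, r, t_plus, t):
    """Cells whose state at time t can still influence an observed cell."""
    return x + 2 * r * (t_plus - t)


def update_counts(x, r, t_plus):
    """Cell updates when every cell is simulated at every step (full)
    versus only the cells still needed at each step (pruned)."""
    full = t_plus * (x + 2 * r * t_plus)
    pruned = sum(cells_needed(x, r, t_plus, t) for t in range(1, t_plus + 1))
    return full, pruned


full_r2, pruned_r2 = update_counts(2000, 2, 500)  # setup with r1 = 2
full_r3, pruned_r3 = update_counts(2400, 3, 400)  # setup with r1 = 3
```

Under this assumption, roughly a quarter of the cell updates in each one-dimensional setup are dispensable, and they are concentrated on the processors that lodge the outermost cells.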

Another important aspect that ultimately is closely related to these
performance issues is whether the need really exists to undertake the simulation
of cellular automata with all the extra load for providing the illusion of
infinity. While unquestionably this seems the right way to approach the
simulation when a new classification scheme is first being tested, perhaps once
it is established the infinity requirement may be dropped and cylindrical
boundary conditions adopted instead. We have performed a few experiments with
this trade-off in mind; their outcomes are shown in Table 2. Examining the table
carefully reveals clearly that both of our cell-centric heuristics retain the
same discriminatory capabilities we found them to possess in Section 5, even
though occasionally the relative positioning of the classes with respect to the
mean or variance of some entropy may not be the same, possibly due to the
different numbers of cells in the two sets of experiments. In fact, a quick
examination of Figures 10 and 11, which depict the spatiotemporal patterns of
some cellular automata with cylindrical boundaries, reveals the same features we
have come to associate with classes (i)-(iv), despite the artificial periodicity
that appears in some cases as a result of assuming finite boundaries. But, as
demonstrated by the results in Table 2, such periodicity appears to have no
noticeable effect on our cell-centric entropies. One immediate consequence of
this is that considerably larger cellular automata can now be simulated with the
same overall processing effort, and also that the sources of load imbalance we
discussed earlier become moot. Likewise, the simulation of two-dimensional
cellular automata for significantly larger values of $t^+$ becomes more viable,
which perhaps may lead to a clarification of the mix-up between classes (ii) and
(iv) under a von
Neumann neighborhood alluded to in Section 6.

Table 2: Means and variances averaged over 5 runs from randomly chosen initial
configurations with cylindrical boundaries for $t^+ = 500$ and T = 25. Experiment
codes are as follows. I: d = 1, 150 cells, $r_1 = 2$; II: d = 1, 300 cells,
$r_1 = 2$; III: d = 1, 150 cells, $r_1 = 3$; IV: d = 1, 300 cells, $r_1 = 3$;
V: d = 2, von Neumann neighborhood, 15 × 15 cells, $r_1 = r_2 = 1$; VI: d = 2,
von Neumann neighborhood, 30 × 30 cells, $r_1 = r_2 = 1$; VII: d = 2, Moore
neighborhood, 15 × 15 cells, $r_1 = r_2 = 1$; VIII: d = 2, Moore neighborhood,
30 × 30 cells, $r_1 = r_2 = 1$. Update rules are as given in Table 1 for each of
classes (i)-(iv). Numbers are truncated to six decimal places.

$C_f$
Experiment   (i)        (ii)       (iii)      (iv)
I            0.000419   0.004730   4.671842   2.726404
II           0.001030   0.023110   3.997883   2.396267
III          0.000170   0.685005   5.757584   3.195973
IV           0.000146   0.723408   5.755847   3.527819
V            0.000214   1.434876   3.692984   1.854994
VI           0.000245   1.543048   3.843658   1.506388
VII          0.066639   0.110987   4.268379   1.206355
VIII         0.134897   0.148938   4.431762   1.452313

$\sigma^2(C_f)$
Experiment   (i)        (ii)       (iii)      (iv)
I            0.000039   0.000917   0.000102   0.442224
II           0.000251   0.009366   0.000273   0.250179
III          0.000010   0.003644   0.000166   0.749215
IV           0.000008   0.004534   0.000076   0.686746
V            0.000019   0.000068   0.000195   0.444950
VI           0.000023   0.000045   0.000058   0.721707
VII          0.010898   0.032257   0.000031   0.805210
VIII         0.009618   0.021399   0.000009   0.352702

$T_f$
Experiment   (i)        (ii)       (iii)      (iv)
I            0.000642   0.006438   0.474852   0.340475
II           0.000643   0.003924   0.474737   0.389267
III          0.000307   0.137617   0.489661   0.355899
IV           0.000310   0.102894   0.490509   0.290770
V            0.000237   0.335283   0.469148   0.231000
VI           0.000237   0.366104   0.484582   0.185903
VII          0.002447   0.002894   0.484342   0.042994
VIII         0.002582   0.003393   0.484713   0.094744

$\sigma^2(T_f)$
Experiment   (i)        (ii)       (iii)      (iv)
I            0.000080   0.000779   0.000010   0.005762
II           0.000079   0.000433   0.000007   0.002972
III          0.000031   0.000544   0.000009   0.006885
IV           0.000032   0.000585   0.000004   0.005747
V            0.000019   0.000005   0.000001   0.012932
VI           0.000019   0.000003   0.000000   0.016717
VII          0.000473   0.000532   0.000010   0.010095
VIII         0.000497   0.000609   0.000002   0.009627

A Selected spatiotemporal patterns

In this appendix, we provide illustrations of the spatiotemporal patterns
resulting from some of the evolutions based on the update rules of Table 1. In
all illustrations, the color white is associated with the 0 state, the color
black with the 1 state. All spatiotemporal plots are framed for increased ease of
reference.

The first set of illustrations corresponds to some of the evolutions to which
Figures 2-5 refer. They are therefore for infinite cellular automata and
correspond to the evolutions of the observed cells, that is, the cells whose
states contributed to the entropy calculations. This set is shown in Figures 8
and 9, respectively for the one- and two-dimensional cases.

A similar second set of illustrations depicts the evolution of the same cells,
but now under cylindrical boundaries, following our remarks in Section 7. These
are given in Figures 10 and 11, respectively for the one- and two-dimensional
cases. They are related to the data shown in Table 2 only in principle, because
they correspond, for the sake of comparison, to initial configurations that match
those used for the infinite cases.

Figure 8: Sample spatiotemporal patterns for the update rules of Table 1 in the
infinite, d = 1 cases with 150 cells observed for 500 time steps. Each plot
displays cell states horizontally for each time step; time grows from top to
bottom. The topmost row of plots corresponds to $r_1 = 2$, the bottommost to
$r_1 = 3$. Within each row, from left to right, the update rules belong each to
classes (i)-(iv).

Figure 9: Sample spatiotemporal patterns for the update rules of Table 1 in
the infinite, d = 2 cases with 30 × 30 observed cells. Each plot displays a
configuration during the evolution of the automaton. The topmost three rows
of plots are relative to the von Neumann update rules, the bottommost three
rows to the Moore update rules. Within each triple of rows, the topmost row
corresponds to t = 0, the middle one to t = 125, and the bottommost to t = 250.
Within each row, from left to right, the update rules belong each to classes
(i)-(iv).

Figure 10: Sample spatiotemporal patterns for the update rules of Table 1 in
the cylindrical, d = 1 cases with 150 cells observed for 500 time steps. Each
plot displays cell states horizontally for each time step; time grows from top to
bottom. The topmost row of plots corresponds to $r_1 = 2$, the bottommost to
$r_1 = 3$. Within each row, from left to right, the update rules belong each to
classes (i)-(iv).

Figure 11: Sample spatiotemporal patterns for the update rules of Table 1 in
the cylindrical, d = 2 cases with 30 × 30 observed cells. Each plot displays a
configuration during the evolution of the automaton. The topmost three rows
of plots are relative to the von Neumann update rules, the bottommost three
rows to the Moore update rules. Within each triple of rows, the topmost row
corresponds to t = 0, the middle one to t = 125, and the bottommost to t = 250.
Within each row, from left to right, the update rules belong each to classes
(i)-(iv).